Section: New Results

PEPSI-Dock: Fast predictions of putative docking poses using accurate knowledge-based potential functions to describe interactions between proteins

Participants: Emilie Neveu, Sergei Grudinin, David Ritchie, Petr Popov.

Many biological tasks involve finding proteins that can act as inhibitors of, for example, a virus or a bacterium. Such a task requires knowledge of the structure of the complex to be formed. The Protein Data Bank can help, but only a small fraction of its structures are complexes [16]. Computational docking predictions, being low-cost and easy to perform, are therefore very attractive, provided that they describe the interactions between proteins accurately and quickly identify the most probable conformation. We have been developing a fast and accurate algorithm that combines FFT-accelerated docking methods with precise knowledge-based potential functions describing the interactions between the atoms of the proteins.

Docking methods can be described as a two-ingredient recipe. First, an approximation of the binding free energy is needed to describe the interactions between the proteins. Second, an efficient sampling algorithm is used to find the lowest-energy conformations. Because going through all the possibilities with a realistic energy function is very costly, the energy is commonly approximated with a very simple function during the search, and a much more precise energy function is then used to re-score the most promising predictions. However, given the numerous local minima that can be found, it is important to use the most accurate free energy from the beginning so as not to miss important docking solutions. In the Hex code, an exhaustive search combined with a spherical polar Fourier representation enables fast exploration of all the conformations. To date, it is still the most efficient and reliable search algorithm [21]. However, only a few types of energy have been accelerated using this technique (shape complementarity and electrostatics, for example). Knowledge-based potential functions are much more precise, but so far they have been used only at the re-scoring stage of the protein docking pipeline. Our aim is thus to take advantage of the fast exhaustive search by integrating the very detailed knowledge-based potentials into the Hex exhaustive search method.
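To illustrate why FFT acceleration makes an exhaustive search affordable, the following minimal sketch scores every circular shift of a one-dimensional toy "ligand" profile against a "receptor" profile with a single FFT-based correlation, and checks the result against the direct double loop. It is not the Hex algorithm: Hex relies on a spherical polar Fourier representation in 3D, whereas this sketch assumes a 1D periodic grid, a made-up overlap score where higher is better, and hypothetical function names.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlation_scores_direct(receptor, ligand):
    """Toy score for every shift t: sum_i receptor[(i + t) % n] * ligand[i]."""
    n = len(receptor)
    return [
        sum(receptor[(i + t) % n] * ligand[i] for i in range(n))
        for t in range(n)
    ]


def correlation_scores_fft(receptor, ligand):
    """Same scores via the correlation theorem: IFFT(FFT(r) * conj(FFT(l)))."""
    n = len(receptor)
    spectrum_r = fft(receptor)
    spectrum_l = fft(ligand)
    product = [a * b.conjugate() for a, b in zip(spectrum_r, spectrum_l)]
    # The inverse transform above is unnormalised, so divide by n here.
    return [c.real / n for c in fft(product, inverse=True)]


# Hypothetical 1D profiles on a periodic grid of 8 points (illustrative only).
receptor = [0, 0, 1, 2, 1, 0, 0, 0]
ligand = [1, 2, 1, 0, 0, 0, 0, 0]

scores = correlation_scores_fft(receptor, ligand)
best_shift = max(range(len(scores)), key=lambda t: scores[t])
print("best shift:", best_shift)
```

The direct evaluation costs a number of operations proportional to n² for n grid points, while the FFT route costs on the order of n log n; this difference is what makes scanning all placements practical on fine grids.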

We have demonstrated that the machine learning process can be adapted so that the knowledge-based potentials describing atomic interactions are expressed in the polynomial basis used in Hex. The knowledge-based scores are then calculated in Hex using fast polynomial expansions accelerated by the fast Fourier transform. Evaluating the knowledge-based scores currently takes more time than a shape+electrostatics representation but is still fast: docking predictions for a single complex take on average 5-10 minutes on a regular laptop computer. Preliminary results on the data set used for training show significant improvements in the accuracy of the method. Considering a prediction to be correct if its root mean square distance from the true solution is smaller than 5 Å, the prediction ranked first is currently correct for more than 50% of the complexes.
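The correctness criterion above can be sketched as follows. This is a minimal illustration, assuming that the predicted and reference coordinates are already matched atom by atom and expressed in the same frame (no superposition is performed); the text does not specify which atoms enter the calculation. The function names and the numbers in the usage example are hypothetical and are not results from this work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square distance between two matched lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def top1_success_rate(rank1_rmsds, threshold=5.0):
    """Fraction of complexes whose first-ranked pose is closer than `threshold` (in Å)."""
    if not rank1_rmsds:
        raise ValueError("at least one complex is required")
    hits = sum(1 for value in rank1_rmsds if value < threshold)
    return hits / len(rank1_rmsds)


# Two atoms shifted by 3 Å along z: the distance is exactly 3 Å.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(predicted, reference))  # 3.0

# Made-up distances (Å) of the first-ranked pose for four hypothetical complexes.
print(top1_success_rate([1.2, 7.5, 4.9, 12.0]))  # 0.5
```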